Proxying for Unobservable Variables with Internet Document-Frequency
Authors
Abstract
The internet contains billions of documents. We study whether the frequency with which different topics are written about carries useful information. Based on the premise that the occurrence of an event increases its textual frequency, we assess whether internet document-frequency can capture cross-sectional variation in the occurrence-frequency of social phenomena. We characterize the conditions under which such proxying is likely to succeed. We successfully proxy for a number of demographic variables at the US city and state levels. We obtain document-frequency-based measures of corruption at the country and state levels and replicate the results of previous research studying its covariates. Finally, we illustrate the usefulness of the approach by creating the first index of corruption in US cities. JEL: H00, J11, C81, B40, D73
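The proxying idea in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' procedure: it assumes that a document-frequency measure is formed by dividing the number of documents mentioning both a place and a topic by the number of documents mentioning the place alone, and that the proxy is then checked against a benchmark by rank correlation. The function names, the place labels, the hit counts, and the benchmark values are all hypothetical and invented for illustration.

```python
from statistics import mean

def relative_frequency(topic_hits, baseline_hits):
    """Share of each place's documents that also mention the topic."""
    return {place: topic_hits[place] / baseline_hits[place] for place in topic_hits}

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical hit counts (invented numbers, not data from the paper).
topic_hits = {"A": 120, "B": 300, "C": 45, "D": 800}              # place + topic
baseline_hits = {"A": 10000, "B": 15000, "C": 9000, "D": 20000}   # place alone
benchmark = {"A": 2.0, "B": 3.5, "C": 1.0, "D": 5.0}              # hypothetical observed measure

proxy = relative_frequency(topic_hits, baseline_hits)
places = sorted(proxy)
rho = spearman([proxy[p] for p in places], [benchmark[p] for p in places])
print(rho)
```

Normalizing by the baseline count is one plausible way to keep heavily documented places from dominating the measure; a rank correlation is used here because only the cross-sectional ordering is being compared.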
Similar resources
Internet Group Management Protocol (IGMP) / Multicast Listener Discovery (MLD)-Based Multicast Forwarding ("IGMP/MLD Proxying")
Status of This Memo This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited. Abstract In certain topologies, it is not...
Managing energy consumption costs in desktop PCs and LAN switches with proxying, split TCP connections, and scaling of link speed
The IT equipment comprising the Internet in the USA uses about $6 billion of electricity every year. Much of this electricity use is wasted on idle, but fully powered-up, desktop PCs and network links. We show how to recover a large portion of the wasted electricity with improved power management methods that are focused on network issues.
TPOT: translucent proxying of TCP
Transparent Layer-4 proxies are being widely deployed in the current Internet to enable a vast variety of applications. These include Web proxy caching, transcoding, service differentiation, and load balancing. To ensure that all IP packets of an intercepted TCP connection are seen by the intercepting transparent proxy, they must sit at focal points in the network. Translucent Proxying of TCP (...
Downloading Wisdom from Online Crowds
The internet and other large textual databases contain billions of documents: is there useful information in the number of documents written about different topics? We propose, based on the premise that the occurrence of a phenomenon increases the likelihood that people write about it, that the relative frequency of documents discussing a phenomenon can be ...
A Survey on the Effective Socio-Cultural Factors on Internet Tendency among Shosh Payam Noor University Students
Nowadays the internet is one of the main instruments for accessing information. Different social groups use it with different motivations, but the phenomenon has a special standing among students. Students are among its main users: besides amusement, fun, and a keen desire to become familiar with an unknown world, they use it for educational and scientific purposes and for finding jobs and educational options. The main purpose...